Smart Paper Notes

AMI-FML: A Privacy-Preserving Federated Machine Learning Framework for AMI

Milan Biswal , Abu Saleh Md Tayeen , Satyajayant Misra

Category: Machine Learning

2021-09-13

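Machine learning (ML) based smart meter data analytics is very promising for energy management and demand-response applications in the advanced metering infrastructure (AMI). A key challenge in developing distributed ML applications for AMI is preserving user privacy while allowing active end-user participation. This paper addresses this challenge and proposes a privacy-preserving federated learning framework for ML applications in the AMI. We treat each smart meter as a federated edge device hosting an ML application that exchanges information with a central aggregator or data concentrator. Instead of transferring the raw data sensed by the smart meters, the ML model weights are transferred to the aggregator to preserve privacy. The aggregator processes these parameters to devise a robust ML model that can be substituted at each edge device. We also discuss strategies to enhance privacy and improve communication efficiency while sharing the ML model parameters, suited to the relatively slow network connections in the AMI. We demonstrate the proposed framework on a use-case federated ML (FML) application that improves short-term load forecasting (STLF). We use a long short-term memory (LSTM) recurrent neural network (RNN) model for STLF. In our architecture, we assume that an aggregator is connected to a group of smart meters. The aggregator uses the learned model gradients received from the federated smart meters to generate an aggregate, robust RNN model that improves the forecasting accuracy of both individual and aggregated STLF. Our results indicate that with FML, forecasting accuracy increases while the data privacy of the end users is preserved.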
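
The aggregation step described above can be sketched as follows. This is a minimal illustration, assuming the aggregator takes a sample-weighted average of the weight vectors reported by the meters (a FedAvg-style rule); the abstract does not state the exact aggregation rule, and the function name `aggregate_weights`, the flat weight vectors, and the sample counts are hypothetical.

```python
from typing import List, Sequence


def aggregate_weights(
    client_weights: Sequence[Sequence[float]],
    client_samples: Sequence[int],
) -> List[float]:
    """Return the sample-weighted average of per-client weight vectors.

    Assumption: every smart meter reports a flat weight vector of the same
    length plus the number of local training samples it used.
    """
    if not client_weights:
        raise ValueError("need at least one client")
    if len(client_weights) != len(client_samples):
        raise ValueError("one sample count per client is required")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all weight vectors must have the same length")
    total = sum(client_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged = [0.0] * dim
    for weights, n in zip(client_weights, client_samples):
        share = n / total  # this client's contribution to the average
        for i, w in enumerate(weights):
            averaged[i] += share * w
    return averaged


# Three hypothetical meters; only weights leave the device, never raw readings.
meters = [[0.2, -1.0], [0.4, 0.0], [0.0, 1.0]]
samples = [100, 100, 200]
global_weights = aggregate_weights(meters, samples)
```

In a full round, the averaged vector would be sent back to each meter to replace its local model before the next round of local training.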

Translated by Google Translate

A Dependable Hybrid Machine Learning Model for Network Intrusion Detection

Md. Alamin Talukder , Khondokar Fida Hasan , Md. Manowarul Islam , Md Ashraf Uddin , Arnisha Akhter , Mohammand Abu Yousuf , Fares Alharbi , Mohammad Ali Moni

Category: Machine Learning

2022-12-08

Network intrusion detection systems (NIDSs) play an important role in computer network security. There are several detection mechanisms where anomaly-based automated detection outperforms others significantly. Amid the sophistication and growing number of attacks, dealing with large amounts of data is a recognized issue in the development of anomaly-based NIDS. However, do current models meet the needs of today's networks in terms of required accuracy and dependability? In this research, we propose a new hybrid model that combines machine learning and deep learning to increase detection rates while securing dependability. Our proposed method ensures efficient pre-processing by combining SMOTE for data balancing and XGBoost for feature selection. We compared our developed method to various machine learning and deep learning algorithms to find a more efficient algorithm to implement in the pipeline. Furthermore, we chose the most effective model for network intrusion based on a set of benchmarked performance analysis criteria. Our method produces excellent results when tested on two datasets, KDDCUP'99 and CIC-MalMem-2022, with an accuracy of 99.99% and 100% for KDDCUP'99 and CIC-MalMem-2022, respectively, and no overfitting or Type-1 and Type-2 issues.
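
The data-balancing step mentioned above relies on SMOTE, which creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below illustrates that idea in plain Python; it is not the authors' pipeline or any library's implementation, and the function name `smote_like_oversample`, the neighbour count `k`, and the toy points are assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence

Point = Sequence[float]


def smote_like_oversample(
    minority: Sequence[Point], n_new: int, k: int = 2, seed: int = 0
) -> List[List[float]]:
    """Generate n_new synthetic points by interpolating between a randomly
    chosen minority point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (q - b) for b, q in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class feature vectors.
minority_class = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_like_oversample(minority_class, n_new=5)
```

Because each synthetic point lies on the segment between two real minority points, the new samples stay inside the region the minority class already occupies.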

Impact Learning: A Learning Method from Features Impact and Competition

Nusrat Jahan Prottasha , Saydul Akbar Murad , Abu Jafar Md Muzahid , Masud Rana , Md Kowsher , Apurba Adhikary , Sujit Biswas , Anupam Kumar Bairagi

Categories: Machine Learning | Artificial Intelligence

2022-11-04

Machine learning is the study of computer algorithms that can automatically improve based on data and experience. Machine learning algorithms build a model from sample data, called training data, to make predictions or judgments without being explicitly programmed to do so. A variety of well-known machine learning algorithms have been developed in computer science to analyze data. This paper introduces a new machine learning algorithm called impact learning. Impact learning is a supervised learning algorithm that can be applied to both classification and regression problems. It can also show its superiority in analyzing competitive data. The algorithm is notable for learning from competitive situations, where the competition comes from the effects of autonomous features. It is built on the impacts of the features, based on the intrinsic rate of natural increase (RNI). We also demonstrate the advantage of impact learning over conventional machine learning algorithms.

Common human diseases prediction using machine learning based on survey data

Jabir Al Nahian , Abu Kaisar Mohammad Masum , Sheikh Abujar , Md. Jueal Mia

Category: Machine Learning

2022-09-22

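In this era, the moment has arrived to move away from disease as the primary focus of medical treatment. Multiple techniques, impressive as they are, have been developed to detect diseases. At present, COVID-19, the common flu, migraine, lung disease, heart disease, kidney disease, diabetes, stomach and gastric disease, bone disease, and autism are very common. In this analysis, we examine disease symptoms and predict diseases from them. We studied a range of symptoms and surveyed people to complete the task. Several classification algorithms were used to train the model, and performance evaluation metrics were used to measure its performance. Finally, we found that the PART classifier outperforms the other classifiers.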

A Survey of Recommender System Techniques and the Ecommerce Domain

Imran Hossain , Md Aminul Haque Palash , Anika Tabassum Sejuty , Noor A Tanjim , MD Abdullah AL Nasim , Sarwar Saif , Abu Bokor Suraj

Category: Artificial Intelligence

2022-08-15

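In this era of big data, it is hard for the current generation to find the right data in the huge amount of data held on online platforms. In such a situation, an information filtering system is needed to help them find the information they are looking for. In recent years, a research field known as recommender systems has emerged. Recommenders have become important because they have many real-life applications. This paper reviews the different techniques and developments of recommender systems in e-commerce, e-business, e-resources, e-government, e-learning, and e-life. By analyzing recent work on the topic, we provide a detailed overview of current developments and identify existing difficulties in recommender systems. The final results give practitioners and researchers the necessary guidance and insight into recommender systems and their applications.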

Statistical Properties of the log-cosh Loss Function Used in Machine Learning

Resve A. Saleh , A. K. Md. Ehsanes Saleh

Categories: (Statistics) Machine Learning | Machine Learning

2022-08-09

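This paper analyzes a popular loss function used in machine learning, the log-cosh loss function. A number of papers have been published using this loss function but, to date, no statistical analysis has been presented in the literature. In this paper, we present the distribution function from which the log-cosh loss arises. We compare it to a similar distribution, the Cauchy distribution, and carry out various statistical procedures that characterize its properties. In particular, we examine its associated PDF, CDF, likelihood function, and Fisher information. Side by side, we consider the Cauchy and Cosh distributions along with the MLE of the location parameter, with asymptotic bias, asymptotic variance, and confidence intervals. We also compare robust estimators derived from several other loss functions, including the Huber loss function and the rank dispersion function. Further, we examine the use of the log-cosh function for quantile regression. In particular, we identify a quantile distribution function from which a maximum likelihood estimator for quantile regression can be derived. Finally, we compare a quantile M-estimator based on log-cosh, which has robust monotonicity, against another approach to quantile regression based on convolutional smoothing.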
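
For readers unfamiliar with the loss itself, the log-cosh loss of a residual r is log(cosh(r)); it behaves like r²/2 for small residuals and like |r| − log 2 for large ones. The sketch below is a minimal plain-Python illustration, not code from the paper; the function names are hypothetical, and the rewritten form |r| + log(1 + e^(−2|r|)) − log 2 is used only to avoid overflow in `cosh` for large residuals.

```python
import math
from typing import Sequence


def log_cosh(residual: float) -> float:
    """Numerically stable log(cosh(residual)).

    Uses the identity log(cosh(r)) = |r| + log1p(exp(-2|r|)) - log(2),
    which never evaluates cosh directly and so cannot overflow.
    """
    a = abs(residual)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def mean_log_cosh_loss(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average log-cosh loss over paired targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(log_cosh(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Small residuals are penalised roughly quadratically, large ones linearly.
small = log_cosh(1e-3)    # close to (1e-3)**2 / 2
large = log_cosh(1000.0)  # close to 1000 - log(2); cosh(1000) would overflow
loss = mean_log_cosh_loss([1.0, 2.0, 3.0], [1.1, 1.9, 10.0])
```

The linear growth for large residuals is what makes the loss less sensitive to outliers than squared error, which is the robustness property the abstract compares against the Huber loss.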

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Aarohi Srivastava , Abhinav Rastogi , Abhishek Rao , Abu Awal Md Shoeb , Abubakar Abid , Adam Fisch , Adam R. Brown , Adam Santoro , Aditya Gupta , Adrià Garriga-Alonso

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning | (Statistics) Machine Learning

2022-06-09

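Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. To inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.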

An Opinion Mining of Text in COVID-19 Issues along with Comparative Study in ML, BERT & RNN

Md. Mahadi Hasan Sany , Mumenunnesa Keya , Sharun Akter Khushbu , Akm Shahariar Azad Rabby , Abu Kaisar Mohammad Masum

Categories: Neural and Evolutionary Computing | Natural Language Processing

2022-01-06

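The world is going through a pandemic, a catastrophic outbreak of the respiratory syndrome known as COVID-19. It is a global threat across 212 countries, where people face severe situations every day, and thousands of people have been infected. Mental health is also affected by the worldwide coronavirus situation. As a result, online sources have become a place where ordinary people share their opinions on any topic: positive and negative news about those affected, financial issues, national and family crises, the loss of import and export earnings, and so on. These circumstances are trending news everywhere. Vast amounts of text are therefore produced within moments. The situation in the subcontinent is the same as in other countries, and so are people's opinions and circumstances, but the language is different. This article proposes specific inputs, together with Bangla text comments from individual sources, to illustrate that machine learning outcomes can be used to build an assistive system. An opinion-mining assistive system can be impactful in every possible language. To the best of our knowledge, this article predicts Bangla input text on COVID-19 issues, proposes an analysis of ML algorithms and deep learning models, and examines future reachability through a comparative analysis. The comparative analysis reports text prediction accuracy for both the ML algorithms and the deep learning models, citing a figure of 79%.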

Solution to the Non-Monotonicity and Crossing Problems in Quantile Regression

Resve A. Saleh , A. K. Md. Ehsanes Saleh

Categories: (Statistics) Machine Learning | Machine Learning

2021-11-08

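This paper proposes a new method to address the long-standing problem of lack of monotonicity in the estimation of the conditional and structural quantile function, also known as the quantile crossing problem. Quantile regression is a very powerful tool in data science in general and in econometrics in particular. Unfortunately, the crossing problem has confounded researchers and practitioners alike for over four decades. Numerous attempts have been made to find an acceptable solution, but no simple and general solution has been found to date. This paper describes an elegant solution to the problem, based on a single mathematical equation that is easy to understand and to implement in R and Python, while greatly reducing the crossing problem. It will be very important in all areas where quantile regression is routinely used, and it may also find application in robust regression, especially in the context of machine learning.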

Data Augmentation using Transformers and Similarity Measures for Improving Arabic Text Classification

Dania Refai , Saleh Abo-Soud , Mohammad Abdel-Rahman

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-28

Learning models are highly dependent on data to work effectively, and they give a better performance upon training on big datasets. Massive research exists in the literature to address the dataset adequacy issue. One promising approach for solving dataset adequacy issues is the data augmentation (DA) approach. In DA, the amount of training data instances is increased by making different transformations on the available data instances to generate new correct and representative data instances. DA increases the dataset size and its variability, which enhances the model performance and its prediction accuracy. DA also solves the class imbalance problem in the classification learning techniques. Few studies have recently considered DA in the Arabic language. These studies rely on traditional augmentation approaches, such as paraphrasing by using rules or noising-based techniques. In this paper, we propose a new Arabic DA method that employs the recent powerful modeling technique, namely the AraGPT-2, for the augmentation process. The generated sentences are evaluated in terms of context, semantics, diversity, and novelty using the Euclidean, cosine, Jaccard, and BLEU distances. Finally, the AraBERT transformer is used on sentiment classification tasks to evaluate the classification performance of the augmented Arabic dataset. The experiments were conducted on four sentiment Arabic datasets, namely AraSarcasm, ASTD, ATT, and MOVIE. The selected datasets vary in size, label number, and unbalanced classes. The results show that the proposed methodology enhanced the Arabic sentiment text classification on all datasets with an increase in F1 score by 4% in AraSarcasm, 6% in ASTD, 9% in ATT, and 13% in MOVIE.
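
Three of the similarity measures named above (Euclidean, cosine, and Jaccard) can be sketched in plain Python as follows. This is an illustration only: the abstract does not say how sentences are turned into vectors, so whitespace-tokenized bag-of-words counts are an assumption here, the function names are hypothetical, English example sentences stand in for Arabic ones, and BLEU is omitted.

```python
import math
from collections import Counter


def bag_of_words(sentence: str) -> Counter:
    """Token counts from whitespace tokenization (an assumed representation)."""
    return Counter(sentence.split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(a: Counter, b: Counter) -> float:
    """Euclidean distance between two count vectors over their joint vocabulary."""
    vocab = a.keys() | b.keys()
    return math.sqrt(sum((a[t] - b[t]) ** 2 for t in vocab))


def jaccard_similarity(a: Counter, b: Counter) -> float:
    """Shared token types divided by all token types; 1.0 for two empty inputs."""
    union = a.keys() | b.keys()
    if not union:
        return 1.0
    return len(a.keys() & b.keys()) / len(union)


original = bag_of_words("the film was great")
generated = bag_of_words("the film was really great")

cos = cosine_similarity(original, generated)
dist = euclidean_distance(original, generated)
jac = jaccard_similarity(original, generated)
```

A generated sentence that scores high on cosine and Jaccard similarity stays close to the original in content, while a nonzero distance indicates that it is not a verbatim copy, which is the balance between semantics and novelty that the evaluation looks for.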
